Parallel Software for Training Large Scale Support Vector Machines on Multiprocessor Systems
Abstract
Parallel software for solving the quadratic program arising in training support vector machines for classification problems is introduced. The software implements an iterative decomposition technique and exploits both the storage and the computing resources available on multiprocessor systems by distributing the heaviest computational tasks of each decomposition iteration. Based on a wide range of recent theoretical advances, the relevant decomposition issues (quadratic subproblem solution, gradient updating, and working set selection) are systematically described, and their careful combination into an effective parallel tool is discussed. A comparison with state-of-the-art packages on benchmark problems demonstrates the good accuracy and the remarkable time saving achieved by the proposed software. Furthermore, challenging experiments on real-world data sets with millions of training samples highlight how the software makes large-scale standard nonlinear support vector machines effectively tractable on common multiprocessor systems. None of the other available codes shows this capability.
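The three decomposition ingredients named in the abstract (working set selection, subproblem solution, gradient updating) can be illustrated with a minimal serial sketch. It is not the software described here: it assumes the smallest possible working set (two variables, chosen as the most violating pair and solved analytically), a precomputed kernel matrix, and no parallelism. The function names `linear_kernel`, `most_violating_pair`, and `train_svm_smo` are hypothetical, introduced only for this illustration.

```python
import numpy as np


def linear_kernel(X):
    """Gram matrix K[i, j] = <x_i, x_j> (illustrative kernel choice)."""
    return X @ X.T


def most_violating_pair(alpha, grad, y, C):
    """Working set selection: pick the pair that most violates the KKT conditions."""
    score = -y * grad
    up = ((y > 0) & (alpha < C)) | ((y < 0) & (alpha > 0))
    low = ((y > 0) & (alpha > 0)) | ((y < 0) & (alpha < C))
    i = np.flatnonzero(up)[np.argmax(score[up])]
    j = np.flatnonzero(low)[np.argmin(score[low])]
    return i, j, score


def train_svm_smo(K, y, C=1.0, tol=1e-3, max_iter=10_000):
    """Solve min 0.5*a'Qa - sum(a), 0 <= a <= C, y'a = 0, with Q_ij = y_i*y_j*K_ij.

    Assumes y holds +1.0/-1.0 labels and both classes are present.
    """
    n = y.shape[0]
    alpha = np.zeros(n)
    grad = -np.ones(n)  # gradient Q @ alpha - 1 at alpha = 0
    for _ in range(max_iter):
        i, j, score = most_violating_pair(alpha, grad, y, C)
        gap = score[i] - score[j]
        if gap < tol:  # approximate KKT conditions satisfied
            break
        # Two-variable subproblem along d_i = y_i, d_j = -y_j (keeps y'a = 0).
        eta = max(K[i, i] + K[j, j] - 2.0 * K[i, j], 1e-12)
        t = gap / eta
        t = min(t, C - alpha[i] if y[i] > 0 else alpha[i])
        t = min(t, alpha[j] if y[j] > 0 else C - alpha[j])
        alpha[i] = min(max(alpha[i] + y[i] * t, 0.0), C)
        alpha[j] = min(max(alpha[j] - y[j] * t, 0.0), C)
        # Gradient updating: only two kernel columns are needed per iteration.
        grad += t * y * (K[:, i] - K[:, j])
    i, j, score = most_violating_pair(alpha, grad, y, C)
    bias = 0.5 * (score[i] + score[j])
    return alpha, bias


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                  [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
    y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
    K = linear_kernel(X)
    alpha, bias = train_svm_smo(K, y, C=10.0)
    print(np.sign(K @ (alpha * y) + bias))
```

In this sketch the gradient update touches two kernel columns per iteration. With larger working sets and millions of samples, computing those columns and updating the gradient is the kind of heavy per-iteration work that the abstract says is distributed across processors.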
Similar resources
Parallel support vector machines on multi-core and multiprocessor systems
This paper proposes a new and efficient parallel implementation of support vector machines based on a decomposition method for handling large-scale datasets. The parallelization targets the most time- and memory-consuming part of training, i.e., updating the vector f. The inner problems are handled by a sequential minimal optimization solver. Since the underlying parallelism is realized by the...
A parallel solver for large quadratic programs in training support vector machines
This work is concerned with the solution of the convex quadratic programming problem arising in training the learning machines named support vector machines. The problem is subject to box constraints and to a single linear equality constraint; it is dense and, for many practical applications, it becomes a large-scale problem. Thus, approaches based on explicit storage of the matrix of the quadr...
Parallel Decomposition Approaches for Training Support Vector Machines
We consider parallel decomposition techniques for solving the large quadratic programming (QP) problems arising in training support vector machines. A recent technique is improved by introducing an efficient solver for the inner QP subproblems and a preprocessing step useful to hot start the decomposition strategy. The effectiveness of the proposed improvements is evaluated by solving large-sca...
Probabilistic Contaminant Source Identification in Water Distribution Infrastructure Systems
Large water distribution systems can be highly vulnerable to penetration of contaminant factors caused by different means including deliberate contamination injections. As contaminants quickly spread into a water distribution network, rapid characterization of the pollution source has a high measure of importance for early warning assessment and disaster management. In this paper, a methodology...
A Parallel and Modular Pattern Classification Framework for Large-Scale Problems
The number of samples that are available on the internet to train pattern classifiers is increasing rapidly, while traditional pattern classification techniques based on a single computer system are powerless to process these large-scale data sets. This chapter presents a parallel and modular pattern classification framework for coping with large-scale pattern classification problems. The propo...
Journal title:
- Journal of Machine Learning Research
Volume 7
Publication date: 2006